St. Petersburg
- North America > United States > Florida > Pinellas County > St. Petersburg (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- Research Report > New Finding (0.46)
- Instructional Material > Course Syllabus & Notes (0.46)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Education (0.46)
- Leisure & Entertainment > Sports (0.46)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > California > San Diego County > Carlsbad (0.04)
- North America > United States > Virginia (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Law (0.67)
- Information Technology (0.46)
These rare, giant millipedes only exist in Florida
When a graduate found a baby Florida scrub millipede, she put it in a kiddie pool. Then it got busy reproducing. While Florida is perhaps best known for its beaches and wetlands, its landscape hosts other notable features: ridges. Millions of years ago, sea levels were higher than they are today, and these elevated areas of land became like islands.
- Europe > United Kingdom > Wales (0.06)
- Asia > India (0.06)
- North America > United States > Oregon (0.05)
'People thought I was a communist doing this as a non-profit': is Wikipedia's Jimmy Wales the last decent tech baron?
In an online landscape characterised by doom and division, the people's encyclopedia stands out - a huge collective endeavour giving everyone free access to the sum of human knowledge. But with Elon Musk branding it 'Wokipedia' and AI looming large, can it survive? Wikipedia will be 25 years old in January. Jimmy Wales's daughter will be 25 and three weeks. It's not a coincidence: on Boxing Day 2000 Wales's then wife, Christine, gave birth to a baby girl, but it quickly became clear that something wasn't right. She had breathed in contaminated amniotic fluid, resulting in a life-threatening condition called meconium aspiration syndrome. An experimental treatment was available at the hospital near where they lived in San Diego. Did they want to try it?
- Europe > United Kingdom > Wales (0.29)
- North America > United States > California > San Diego County > San Diego (0.24)
- Europe > Ukraine (0.05)
- Media > News (1.00)
- Health & Medicine (1.00)
- Leisure & Entertainment > Sports (0.94)
Discrepancy Detection at the Data Level: Toward Consistent Multilingual Question Answering
Calvo-Bartolomé, Lorena, Aldana, Valérie, Cantarero, Karla, de Mesa, Alonso Madroñal, Arenas-García, Jerónimo, Boyd-Graber, Jordan
Multilingual question answering (QA) systems must ensure factual consistency across languages, especially for objective queries such as "What is jaundice?", while also accounting for cultural variation in subjective responses. We propose MIND, a user-in-the-loop fact-checking pipeline to detect factual and cultural discrepancies in multilingual QA knowledge bases. MIND highlights divergent answers to culturally sensitive questions (e.g., "Who assists in childbirth?") that vary by region and context. We evaluate MIND on a bilingual QA system in the maternal and infant health domain and release a dataset of bilingual questions annotated for factual and cultural inconsistencies. We further test MIND on datasets from other domains to assess generalization. In all cases, MIND reliably identifies inconsistencies, supporting the development of more culturally aware and factually consistent QA systems.
- Europe > France (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Virginia (0.04)
Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification?
Richter, Cedric, Wehrheim, Heike
Automatic software verifiers have become increasingly effective at the task of checking software against (formal) specifications. Yet, their adoption in practice has been hampered by the lack of such specifications in real-world code. Large Language Models (LLMs) have shown promise in inferring formal postconditions from natural language hints embedded in code such as function names, comments or documentation. Using the generated postconditions as specifications in a subsequent verification, however, often leads verifiers to suggest invalid inputs, hinting at potential issues that ultimately turn out to be false alarms. To address this, we revisit the problem of specification inference from natural language in the context of automatic software verification. In the process, we introduce NL2Contract, the task of employing LLMs to translate informal natural language into formal functional contracts, consisting of postconditions as well as preconditions. We introduce metrics to validate and compare different NL2Contract approaches, using soundness, the bug-discriminative power of the generated contracts, and their usability in the context of automatic software verification as key metrics. We evaluate NL2Contract with different LLMs and compare it to the task of postcondition generation, nl2postcond. Our evaluation shows that (1) LLMs are generally effective at generating functional contracts sound for all possible inputs, (2) the generated contracts are sufficiently expressive for discriminating buggy from correct behavior, and (3) verifiers supplied with LLM-inferred functional contracts produce fewer false alarms than when provided with postconditions alone. Further investigations show that LLM-inferred preconditions generally align well with developers' intentions, which allows us to use automatic software verifiers to catch real-world bugs.
- Europe > Germany > Lower Saxony > Oldenburg (0.40)
- Europe > Austria > Vienna (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Li, Xiaoxi, Jin, Jiajie, Dong, Guanting, Qian, Hongjin, Wu, Yongkang, Wen, Ji-Rong, Zhu, Yutao, Dou, Zhicheng
Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders their ability to produce comprehensive research reports requiring synthesis of diverse web information. To address this, we propose WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate among web pages, and draft reports during the reasoning process. WebThinker integrates a Deep Web Explorer module, enabling LRMs to dynamically search, navigate, and extract information from the web when encountering knowledge gaps. It also employs an Autonomous Think-Search-and-Draft strategy, allowing the model to seamlessly interleave reasoning, information gathering, and report writing in real time. To further enhance research tool utilization, we introduce an RL-based training strategy via iterative online Direct Preference Optimization (DPO). Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems. Our approach enhances LRM reliability and applicability in complex scenarios, paving the way for more capable and versatile deep research systems. The code is available at https://github.com/RUC-NLPIR/WebThinker.
- Europe > Austria > Vienna (0.14)
- Asia > Southeast Asia (0.04)
- Asia > Singapore (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Learning to Guarantee Type Correctness in Code Generation through Type-Guided Program Synthesis
Huang, Zhechong, Zhang, Zhao, Ji, Ruyi, Xia, Tingxuan, Zhu, Qihao, Cao, Qinxiang, Sun, Zeyu, Xiong, Yingfei
Language models have shown remarkable proficiency in code generation; nevertheless, ensuring type correctness remains a challenge. Although traditional methods, such as constrained decoding, alleviate this problem by externally rejecting untypable code, the model itself does not effectively learn type reasoning internally, which ultimately limits its overall performance. This paper introduces TyFlow, a novel system that internalizes type reasoning within code generation to guide the model to learn the type system. The core of our approach is a novel type-guided program synthesis system that maintains an isomorphism between type derivation trees and synthesis derivation trees, enabling a new code representation based on synthesis decision sequences rather than traditional text-based token sequences. By offloading the complexity of type system learning to the representation itself, models can redirect their computational resources toward higher-level program semantics. Our evaluation shows that TyFlow not only eliminates type errors but also significantly improves functional correctness, highlighting the importance of aligning LMs with type systems internally.
- Europe > Austria > Vienna (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Workflow (0.94)
- Research Report (0.82)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (0.83)